massive language model
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
We show for the first time that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one-shot, without any retraining, at minimal loss of accuracy. This is achieved via a new pruning method called SparseGPT, specifically designed to work efficiently and accurately on massive GPT-family models. We can execute SparseGPT on the largest available open-source models, OPT-175B and BLOOM-176B, in under 4.5 hours, and can reach 60% unstructured sparsity with negligible increase in perplexity: remarkably, more than 100 billion weights from these models can be ignored at inference time. SparseGPT generalizes to semi-structured (2:4 and 4:8) patterns, and is compatible with weight quantization approaches. The code is available at: https://github.com/IST-DASLab/sparsegpt.
- Europe > Austria (0.04)
- North America > United States > New Jersey (0.04)
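The SparseGPT abstract above refers to unstructured sparsity and to semi-structured (2:4 and 4:8) patterns. The sketch below is not the SparseGPT algorithm, which uses an approximate second-order weight reconstruction; it is only a minimal, assumed illustration of what a 2:4 mask means: in every contiguous group of four weights, the two smallest-magnitude entries are zeroed, giving exactly 50% sparsity in a hardware-friendly layout. The function name and the magnitude-based selection are illustrative choices, not from the paper.

```python
import torch

def prune_2_4_magnitude(weight: torch.Tensor) -> torch.Tensor:
    """Zero the 2 smallest-magnitude weights in every group of 4 along each row."""
    out_features, in_features = weight.shape
    assert in_features % 4 == 0, "input dimension must be divisible by the group size 4"
    groups = weight.reshape(out_features, in_features // 4, 4)
    # Indices of the 2 largest-magnitude weights in each group of 4.
    keep_idx = groups.abs().topk(k=2, dim=-1).indices
    # Build a 0/1 mask that keeps only those entries.
    mask = torch.zeros_like(groups)
    mask.scatter_(-1, keep_idx, 1.0)
    return (groups * mask).reshape(out_features, in_features)

w = torch.randn(8, 16)
w_sparse = prune_2_4_magnitude(w)
print("sparsity:", (w_sparse == 0).float().mean().item())  # prints 0.5
```

The appeal of an N:M pattern like 2:4, as opposed to fully unstructured sparsity, is that the zeros fall in a regular layout that sparse matrix hardware can exploit to skip the pruned weights at inference time.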
Nvidia makes massive language model available to enterprises
At its fall 2021 GPU Technology Conference (GTC) today, Nvidia announced that it's making Megatron 530B, one of the world's largest language models, available to enterprises for training to serve new domains and languages. First detailed in early October, Megatron 530B -- also known as Megatron-Turing Natural Language Generation (MT-NLG) -- contains 530 billion parameters and achieves high accuracy in a broad set of natural language tasks, including reading comprehension, commonsense reasoning, and natural language inference. "Today, we provide recipes for customers to build, train, and customize large language models, including Megatron 530B. This includes scripts, code, and 530B untrained model. Customers can start from smaller models and scale up to larger models as they see fit," Nvidia VP of AI software product management Kari Briski told VentureBeat via email.
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.05)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.05)
- Asia > China > Beijing > Beijing (0.05)
AI21 Labs has trained a massive language model to fiercely rival OpenAI's GPT-3
AI21 Labs: For the better part of a year, OpenAI's GPT-3 has remained among the largest artificial intelligence language models ever created. Through an API, people have used it to automatically write articles and emails, summarize text, compose poetry and recipes, generate deep learning code in Python, and create layouts and templates for websites. Now an AI lab based in Tel Aviv, Israel, named AI21 Labs, says it plans to release a larger model and make it available via a service, with the aim of challenging OpenAI's dominance in the natural-language-processing-as-a-service field. The startup says the largest version of its model, known as Jurassic-1 Jumbo, contains 178 billion parameters, 3 billion more than GPT-3. In artificial intelligence and machine learning, parameters are the parts of a model that are learned from historical training data.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.85)
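The article above describes parameters as the parts of a model learned from historical training data, and compares Jurassic-1 Jumbo's 178 billion parameters with GPT-3's 175 billion. As a rough illustration only, the sketch below counts the learned weights and biases of a tiny made-up PyTorch model; the layer sizes are arbitrary assumptions and bear no relation to either model.

```python
import torch.nn as nn

# Layer sizes are arbitrary and unrelated to GPT-3 or Jurassic-1.
model = nn.Sequential(
    nn.Embedding(num_embeddings=50_000, embedding_dim=512),  # 50,000 * 512 weights
    nn.Linear(512, 2048),   # 512 * 2048 weights + 2048 biases
    nn.ReLU(),              # no learned parameters
    nn.Linear(2048, 512),   # 2048 * 512 weights + 512 biases
)

total = sum(p.numel() for p in model.parameters())
print(f"total learned parameters: {total:,}")  # 27,699,712
```

A "175-billion-parameter" or "178-billion-parameter" model is simply this same count taken over a vastly larger stack of layers.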
AI21 Labs trains a massive language model to rival OpenAI's GPT-3
For the better part of a year, OpenAI's GPT-3 has remained among the largest AI language models ever created, if not the largest of its kind. Via an API, people have used it to automatically write emails and articles, summarize text, compose poetry and recipes, create website layouts, and generate code for deep learning in Python. But an AI lab based in Tel Aviv, Israel -- AI21 Labs -- says it's planning to release a larger model and make it available via a service, with the idea being to challenge OpenAI's dominance in the "natural language processing-as-a-service" field. The startup says that the largest version of its model -- called Jurassic-1 Jumbo -- contains 178 billion parameters, or 3 billion more than GPT-3 (but not more than PanGu-Alpha, HyperCLOVA, or Wu Dao 2.0).
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.24)
- North America > United States > New York (0.15)
- North America > United States > Massachusetts (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.82)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.85)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.85)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)